423 research outputs found

    Parameterized Algorithms for Partitioning Graphs into Highly Connected Clusters

    Clustering is a well-known and important problem with numerous applications. The graph-based model is one of the typical cluster models. In the graph model, clusters are generally defined as cliques. However, such an approach might be too restrictive, as in some applications not all objects from the same cluster must be connected. That is why different types of clique relaxations are often considered as clusters. In our work, we consider the problem of partitioning a graph into clusters and the problem of isolating a cluster of a special type, where by cluster we mean a highly connected subgraph. Such clustering was initially proposed by Hartuv and Shamir, and their HCS clustering algorithm has been extensively applied in practice. It was used to cluster cDNA fingerprints, to find complexes in protein-protein interaction data, to group protein sequences hierarchically into superfamily and family clusters, and to find families of regulatory RNA structures. The HCS algorithm partitions a graph into highly connected subgraphs. However, it does not necessarily delete the minimum number of edges to do so. In our work, we try to minimize the number of edge deletions. We consider the problems from the parameterized point of view, where the main parameter is the number of allowed edge deletions. The presented algorithms significantly improve the previously known running times for the Highly Connected Deletion (from $\mathcal{O}^*(81^k)$ to $\mathcal{O}^*(3^k)$), Isolated Highly Connected Subgraph (from $\mathcal{O}^*(4^k)$ to $\mathcal{O}^*(k^{\mathcal{O}(k^{2/3})})$), and Seeded Highly Connected Edge Deletion (from $\mathcal{O}^*(16^{k^{3/4}})$ to $\mathcal{O}^*(k^{\sqrt{k}})$) problems. Furthermore, we present a subexponential algorithm for the Highly Connected Deletion problem if the number of clusters is bounded. Overall, our work contains three subexponential algorithms, which is unusual, as until very recently very few problems were known to admit subexponential algorithms.
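
The abstract does not spell out what "highly connected" means. A minimal sketch, assuming the Hartuv and Shamir definition (a graph on n vertices is highly connected when its edge connectivity exceeds n/2), could check the property by brute force on tiny graphs. The function names are hypothetical, the enumeration is exponential, and this is not one of the paper's algorithms; single-vertex graphs are deliberately left out because conventions for them differ.

```python
from itertools import combinations


def edge_connectivity(vertices, edges):
    """Minimum number of edges whose removal disconnects the graph.

    Brute force over all cuts; only meant for very small graphs.
    """
    vertices = list(vertices)
    if len(vertices) < 2:
        raise ValueError("need at least two vertices")
    first, rest = vertices[0], vertices[1:]
    best = None
    # Every cut has one side containing `first`; keep the other side non-empty.
    for m in range(len(rest)):
        for extra in combinations(rest, m):
            side = {first, *extra}
            crossing = sum(1 for u, v in edges if (u in side) != (v in side))
            if best is None or crossing < best:
                best = crossing
    return best


def is_highly_connected(vertices, edges):
    """Assumed definition: edge connectivity strictly greater than n/2."""
    n = len(list(vertices))
    return 2 * edge_connectivity(vertices, edges) > n


# A clique on four vertices versus a cycle on four vertices.
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
c4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(is_highly_connected(range(4), k4))
print(is_highly_connected(range(4), c4))
```

Under this definition the 4-clique qualifies as a cluster while the 4-cycle does not, which illustrates how the notion relaxes cliques without accepting every connected subgraph.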

    Communication-Efficient Collaborative Regret Minimization in Multi-Armed Bandits

    In this paper, we study the collaborative learning model, which concerns the tradeoff between parallelism and communication overhead in multi-agent multi-armed bandits. For regret minimization in multi-armed bandits, we present the first set of tradeoffs between the number of rounds of communication among the agents and the regret of the collaborative learning process. Comment: 13 pages, 1 figure

    An Exponential Lower Bound for Cut Sparsifiers in Planar Graphs

    Given an edge-weighted graph G with a set Q of k terminals, a mimicking network is a graph with the same set of terminals that exactly preserves the sizes of minimum cuts between any partition of the terminals. A natural question in the area of graph compression is to provide mimicking networks that are as small as possible, for an input graph G that is either arbitrary or comes from a specific graph class. In this note we show an exponential lower bound for cut mimicking networks in planar graphs: there are edge-weighted planar graphs with k terminals that require 2^(k-2) edges in any mimicking network. This nearly matches an upper bound of O(k * 2^(2k)) of Krauthgamer and Rika [SODA 2013, arXiv:1702.05951] and is in sharp contrast with the O(k^2) upper bound under the assumption that all terminals lie on a single face [Goranci, Henzinger, Peng, arXiv:1702.01136]. As a side result we show a hard instance for the double-exponential upper bounds given by Hagerup, Katajainen, Nishimura, and Ragde [JCSS 1998], Khan and Raghavendra [IPL 2014], and Chambers and Eppstein [JGAA 2013].
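
To make the definition of a mimicking network concrete, the following sketch computes, by brute force, the minimum cut value for every bipartition of the terminals and uses that profile to compare two tiny graphs. It is an illustration only: the function names and the example graphs are made up here, and the exponential enumeration has nothing to do with the constructions or bounds in the paper.

```python
from itertools import combinations


def cut_weight(edges, side):
    """Total weight of edges with exactly one endpoint in `side`."""
    return sum(w for u, v, w in edges if (u in side) != (v in side))


def terminal_cut_profile(vertices, edges, terminals):
    """Map each non-trivial terminal subset S to the minimum weight of a cut
    separating S from the remaining terminals (brute force, tiny graphs only)."""
    terminals = list(terminals)
    others = [v for v in vertices if v not in terminals]
    profile = {}
    for r in range(1, len(terminals)):
        for chosen in combinations(terminals, r):
            best = None
            # Non-terminal vertices may fall on either side of the cut.
            for m in range(len(others) + 1):
                for extra in combinations(others, m):
                    w = cut_weight(edges, set(chosen) | set(extra))
                    if best is None or w < best:
                        best = w
            profile[frozenset(chosen)] = best
    return profile


def is_mimicking_network(graph, candidate, terminals):
    """True if both (vertices, edges) pairs have identical terminal cut profiles."""
    return (terminal_cut_profile(*graph, terminals)
            == terminal_cut_profile(*candidate, terminals))


# Star with centre "c" and terminals a, b, d (hypothetical example).
star = (["a", "b", "d", "c"], [("a", "c", 1), ("b", "c", 2), ("d", "c", 3)])
# Candidate without the centre: the path a - d - b.
path = (["a", "b", "d"], [("a", "d", 1), ("b", "d", 2)])
print(is_mimicking_network(star, path, ["a", "b", "d"]))
```

In this toy case the two-edge path preserves all terminal cuts of the three-edge star; the lower bound in the abstract says that for some planar graphs no such compression below 2^(k-2) edges is possible.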

    A Multi-labeled Tree Edit Distance for Comparing "Clonal Trees" of Tumor Progression

    We introduce a new edit distance measure between a pair of "clonal trees", each representing the progression and mutational heterogeneity of a tumor sample, constructed by the use of single cell or bulk high throughput sequencing data. In a clonal tree, each vertex represents a specific tumor clone, and is labeled with one or more mutations in a way that each mutation is assigned to the oldest clone that harbors it. Given two clonal trees, our multi-labeled tree edit distance (MLTED) measure is defined as the minimum number of mutation/label deletions, (empty) leaf deletions, and vertex (clonal) expansions, applied in any order, to convert each of the two trees to the maximal common tree. We show that the MLTED measure can be computed efficiently in polynomial time and that it captures the similarity between trees of different clonal granularity well. We have implemented our algorithm to compute MLTED exactly and applied it successfully to a variety of data sets. The source code of our method is available at https://github.com/khaled-rahman/leafDelTED

    Creation and Characterization of Mycolicibacterium Smegmatis mc2155 with Deletions in Genes Encoding Sterol Oxidation Enzymes

    The fast-growing saprotrophic strain Mycolicibacterium smegmatis mc2155 is capable of utilizing plant and animal sterols and can be used to create genetically engineered strains producing biologically active steroids. Oxidation of the 3β-hydroxyl group and Δ5(6)→Δ4(5) double bond isomerization, followed by formation of stenones from sterols, are considered the initial stage of steroid catabolism in some actinobacteria. The study of the mechanism of steroid nucleus 3β-hydroxyl group oxidation is relevant for developing a method for the microbiological production of valuable 3β-hydroxy-5-en-steroids. A mutant strain of M. smegmatis with deletions in three genes (MSMEG_1604, MSMEG_5228 and MSMEG_5233) encoding known enzymes exhibiting 3β-hydroxysteroid dehydrogenase activity was constructed by homologous recombination coupled with double selection. The resulting mutant retained macromorphological properties and the ability to convert cholesterol. 3-Keto-4-en-steroids were found among the sterol catabolism intermediates. Experimentally obtained data indicate the presence of a previously undetected intracellular enzyme that performs the function of 3β-hydroxysteroid dehydrogenase/Δ5(6)→Δ4(5) isomerase.

    A Multi-Labeled Tree Dissimilarity Measure for Comparing “Clonal Trees” of Tumor Progression

    We introduce a new dissimilarity measure between a pair of “clonal trees”, each representing the progression and mutational heterogeneity of a tumor sample, constructed by the use of single cell or bulk high throughput sequencing data. In a clonal tree, each vertex represents a specific tumor clone, and is labeled with one or more mutations in a way that each mutation is assigned to the oldest clone that harbors it. Given two clonal trees, our multi-labeled tree dissimilarity (MLTD) measure is defined as the minimum number of mutation/label deletions, (empty) leaf deletions, and vertex (clonal) expansions, applied in any order, to convert each of the two trees to the maximum common tree. We show that the MLTD measure can be computed efficiently in polynomial time and that it captures the similarity between trees of different clonal granularity well.